Overview

Dataset statistics

Number of variables: 7
Number of observations: 32797576
Missing cells: 98392728
Missing cells (%): 42.9%
Duplicate rows: 235397
Duplicate rows (%): 0.7%
Total size in memory: 2.0 GiB
Average record size in memory: 65.0 B
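The percentages and the memory figure follow from the raw counts. A minimal arithmetic check, using only the numbers reported in this block (the variable names are illustrative):

```python
# Figures copied from the "Dataset statistics" block.
n_variables = 7
n_observations = 32_797_576
missing_cells = 98_392_728
duplicate_rows = 235_397

# Missing cells are measured against every cell in the table.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Duplicate rows are measured against the number of rows.
duplicate_pct = 100 * duplicate_rows / n_observations

# 65.0 B per record times the row count, expressed in GiB (2**30 bytes).
approx_size_gib = 65.0 * n_observations / 2**30

print(f"missing cells:  {missing_pct:.1f}%")
print(f"duplicate rows: {duplicate_pct:.1f}%")
print(f"approx. size:   {approx_size_gib:.1f} GiB")
```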

Variable types

Numeric: 6
Boolean: 1

Dataset

Description: Predict a neutrino particle’s direction. You will develop a model based on data from the 'IceCube' detector, which observes the cosmos from deep within the South Pole ice.
URL:
Copyright: (c) Mr. Eslam Fouad 2023

Alerts

Dataset has 235397 (0.7%) duplicate rows (Duplicates)
x has 32792416 (> 99.9%) missing values (Missing)
y has 32792416 (> 99.9%) missing values (Missing)
z has 32792416 (> 99.9%) missing values (Missing)
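The missing counts split the rows into two groups that do not overlap. A short check, using only counts reported in this document (the dictionary layout is illustrative), shows that the rows carrying x, y, z number exactly 5160, which is also the number of distinct sensor_id values and the number of rows missing time, charge and auxiliary:

```python
n_observations = 32_797_576

# Missing counts per column, as reported in the variable summaries.
missing = {
    "time": 5_160, "charge": 5_160, "auxiliary": 5_160,
    "x": 32_792_416, "y": 32_792_416, "z": 32_792_416,
}
present = {name: n_observations - n for name, n in missing.items()}

rows_with_position = present["x"]  # rows that have x, y, z
rows_with_pulse = present["time"]  # rows that have time, charge, auxiliary

print(rows_with_position, rows_with_pulse)
print(rows_with_position + rows_with_pulse == n_observations)
```

The two groups add up to the full row count, so the > 99.9% missing alerts on x, y, z look like a consequence of how the table was assembled rather than of data loss. That reading is an inference from the counts, not something the report states.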

Reproduction

Analysis started: 2023-06-13 14:34:39.250075
Analysis finished: 2023-06-13 14:49:15.258071
Duration: 14 minutes and 36.01 seconds
Software version: pandas-profiling v3.6.6
Download configuration: config.json
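The duration can be recomputed from the two timestamps. A small sketch using only the standard library:

```python
from datetime import datetime

started = datetime.fromisoformat("2023-06-13 14:34:39.250075")
finished = datetime.fromisoformat("2023-06-13 14:49:15.258071")

elapsed = finished - started
minutes, seconds = divmod(elapsed.total_seconds(), 60)

print(f"{int(minutes)} minutes and {seconds:.2f} seconds")
```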

Variables

sensor_id
Real number (ℝ)

Distinct: 5160
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2713.0239
Minimum: 0
Maximum: 5159
Zeros: 4067
Zeros (%): < 0.1%
Negative: 0
Negative (%): 0.0%
Memory size: 500.5 MiB

Quantile statistics

Minimum: 0
5-th percentile: 286
Q1: 1366
Median: 2741
Q3: 4096
95-th percentile: 5003
Maximum: 5159
Range: 5159
Interquartile range (IQR): 2730

Descriptive statistics

Standard deviation: 1543.4085
Coefficient of variation (CV): 0.56888863
Kurtosis: -1.2531026
Mean: 2713.0239
Median Absolute Deviation (MAD): 1369
Skewness: -0.054415364
Sum: 8.8980609 × 10^10
Variance: 2382109.8
Monotonicity: Not monotonic
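Several of these entries are derived from others in the same block. A minimal check for sensor_id, using the reported values (rounded as printed, so the results agree only to the shown precision):

```python
# Values copied from the sensor_id summary.
minimum, maximum = 0, 5159
q1, q3 = 1366, 4096
mean, std = 2713.0239, 1543.4085

value_range = maximum - minimum  # Range
iqr = q3 - q1                    # Interquartile range
cv = std / mean                  # Coefficient of variation
variance = std ** 2              # Variance

print(value_range, iqr)
print(f"CV = {cv:.6f}, variance = {variance:.1f}")
```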
Histogram with fixed size bins (bins=50)
Common values
Value                Count     Frequency (%)
5038                 14561     < 0.1%
4979                 14468     < 0.1%
4978                 14321     < 0.1%
5037                 14294     < 0.1%
4915                 14134     < 0.1%
4918                 14133     < 0.1%
4913                 14003     < 0.1%
4976                 13943     < 0.1%
5033                 13942     < 0.1%
4858                 13892     < 0.1%
Other values (5150)  32655885  99.6%

Minimum 10 values
Value  Count  Frequency (%)
0      4067   < 0.1%
1      4747   < 0.1%
2      4779   < 0.1%
3      4815   < 0.1%
4      4241   < 0.1%
5      3615   < 0.1%
6      3830   < 0.1%
7      3914   < 0.1%
8      4328   < 0.1%
9      5165   < 0.1%

Maximum 10 values
Value  Count  Frequency (%)
5159   13448  < 0.1%
5158   12522  < 0.1%
5157   12819  < 0.1%
5156   13281  < 0.1%
5155   13362  < 0.1%
5154   12683  < 0.1%
5153   12249  < 0.1%
5152   12394  < 0.1%
5151   11646  < 0.1%
5150   12061  < 0.1%

time
Real number (ℝ)

Distinct: 52433
Distinct (%): 0.2%
Missing: 5160
Missing (%): < 0.1%
Infinite: 0
Infinite (%): 0.0%
Mean: 13130.478
Minimum: 5714
Maximum: 77785
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 500.5 MiB

Quantile statistics

Minimum: 5714
5-th percentile: 8765
Q1: 10566
Median: 11815
Q3: 13916
95-th percentile: 21428
Maximum: 77785
Range: 72071
Interquartile range (IQR): 3350

Descriptive statistics

Standard deviation: 4876.7966
Coefficient of variation (CV): 0.37141043
Kurtosis: 15.177769
Mean: 13130.478
Median Absolute Deviation (MAD): 1475
Skewness: 3.2034347
Sum: 4.305801 × 10^11
Variance: 23783145
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values
Value                 Count     Frequency (%)
9885                  11999     < 0.1%
9882                  11974     < 0.1%
9887                  11956     < 0.1%
9884                  11944     < 0.1%
9888                  11907     < 0.1%
9890                  11743     < 0.1%
9883                  11691     < 0.1%
9886                  11682     < 0.1%
9880                  11641     < 0.1%
9889                  11625     < 0.1%
Other values (52423)  32674254  99.6%

Minimum 10 values
Value  Count  Frequency (%)
5714   4      < 0.1%
5715   8      < 0.1%
5716   21     < 0.1%
5717   27     < 0.1%
5718   26     < 0.1%
5719   30     < 0.1%
5720   35     < 0.1%
5721   48     < 0.1%
5722   45     < 0.1%
5723   43     < 0.1%

Maximum 10 values
Value  Count  Frequency (%)
77785  1      < 0.1%
76736  1      < 0.1%
76151  1      < 0.1%
75889  1      < 0.1%
75814  1      < 0.1%
75550  1      < 0.1%
75208  1      < 0.1%
75148  1      < 0.1%
75013  1      < 0.1%
75006  1      < 0.1%

charge
Real number (ℝ)

Distinct: 8661
Distinct (%): < 0.1%
Missing: 5160
Missing (%): < 0.1%
Infinite: 0
Infinite (%): 0.0%
Mean: 3.9089811
Minimum: 0.025
Maximum: 2762.0249
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 500.5 MiB

Quantile statistics

Minimum: 0.025
5-th percentile: 0.375
Q1: 0.77499998
Median: 1.075
Q3: 1.775
95-th percentile: 12.125
Maximum: 2762.0249
Range: 2761.9999
Interquartile range (IQR): 1

Descriptive statistics

Standard deviation: 16.288969
Coefficient of variation (CV): 4.1670627
Kurtosis: 846.18837
Mean: 3.9089811
Median Absolute Deviation (MAD): 0.39999998
Skewness: 16.464331
Sum: 1.2818493 × 10^8
Variance: 265.33052
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values
Value                Count     Frequency (%)
0.9750000238         1359439   4.1%
0.9250000119         1353154   4.1%
1.024999976          1326740   4.0%
0.875                1319422   4.0%
1.075000048          1265480   3.9%
0.8249999881         1257551   3.8%
1.125                1182114   3.6%
0.7749999762         1161263   3.5%
0.7250000238         1075470   3.3%
1.174999952          1073846   3.3%
Other values (8651)  20417937  62.3%

Minimum 10 values
Value          Count   Frequency (%)
0.02500000037  371     < 0.1%
0.07500000298  3957    < 0.1%
0.125          121103  0.4%
0.174999997    181625  0.6%
0.224999994    335813  1.0%
0.275000006    407549  1.2%
0.3249999881   427386  1.3%
0.375          465653  1.4%
0.4250000119   556332  1.7%
0.474999994    629413  1.9%

Maximum 10 values
Value        Count  Frequency (%)
2762.024902  1      < 0.1%
2728.024902  1      < 0.1%
2600.425049  1      < 0.1%
2595.375     1      < 0.1%
2555.125     1      < 0.1%
2508.024902  1      < 0.1%
2504.074951  1      < 0.1%
2439.574951  1      < 0.1%
2410.375     1      < 0.1%
2401.675049  1      < 0.1%
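Values such as 0.9750000238 and 1.024999976 look like decimal charges (0.975, 1.025) stored in single precision. The report does not state the column dtype, so float32 is an assumption here. A minimal sketch that rounds a few nominal values through float32 with the standard library and prints them to 10 significant digits:

```python
import struct

def as_float32(value: float) -> float:
    """Round a Python float (float64) to the nearest float32 and back."""
    return struct.unpack("<f", struct.pack("<f", value))[0]

# Nominal charges; assumed to be the intended decimal values.
for nominal in (0.975, 0.925, 1.025, 0.875):
    print(nominal, "->", f"{as_float32(nominal):.10g}")
```

Values that are exact binary fractions, such as 0.875 and 0.125, survive unchanged, which matches the entries in the tables that print without trailing digits.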

auxiliary
Boolean

Distinct: 2
Distinct (%): < 0.1%
Missing: 5160
Missing (%): < 0.1%
Memory size: 500.5 MiB

Common values
Value      Count     Frequency (%)
False      23551893  71.8%
True       9240523   28.2%
(Missing)  5160      < 0.1%

x
Real number (ℝ)

Distinct: 118
Distinct (%): 2.3%
Missing: 32792416
Missing (%): > 99.9%
Infinite: 0
Infinite (%): 0.0%
Mean: 5.8708295
Minimum: -570.9
Maximum: 576.37
Zeros: 0
Zeros (%): 0.0%
Negative: 2460
Negative (%): < 0.1%
Memory size: 500.5 MiB

Quantile statistics

Minimum: -570.9
5-th percentile: -447.74
Q1: -224.09
Median: 16.99
Q3: 224.58
95-th percentile: 472.05
Maximum: 576.37
Range: 1147.27
Interquartile range (IQR): 448.67

Descriptive statistics

Standard deviation: 285.15121
Coefficient of variation (CV): 48.570856
Kurtosis: -0.86208205
Mean: 5.8708295
Median Absolute Deviation (MAD): 224.565
Skewness: -0.0028952251
Sum: 30293.48
Variance: 81311.214
Monotonicity: Not monotonic
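Judging by the reported numbers, the two percentage columns appear to use different denominators: Distinct (%) matches the count of non-missing values, while Negative (%) matches the total number of rows. The sketch below recomputes both for x, y and z from the figures in their summaries; the denominators are inferred from the arithmetic, not taken from the tool's documentation.

```python
n_observations = 32_797_576

# Counts copied from the x, y and z summaries.
columns = {
    "x": {"distinct": 118, "missing": 32_792_416, "negative": 2_460},
    "y": {"distinct": 117, "missing": 32_792_416, "negative": 2_580},
    "z": {"distinct": 4_975, "missing": 32_792_416, "negative": 2_736},
}

distinct_pct = {}
negative_pct = {}
for name, c in columns.items():
    non_missing = n_observations - c["missing"]
    distinct_pct[name] = 100 * c["distinct"] / non_missing
    negative_pct[name] = 100 * c["negative"] / n_observations
    print(f"{name}: distinct {distinct_pct[name]:.1f}%, "
          f"negative {negative_pct[name]:.4f}%")
```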
Histogram with fixed size bins (bins=50)
Common values
Value               Count     Frequency (%)
-279.53             60        < 0.1%
11.87               60        < 0.1%
-111.51             60        < 0.1%
-234.95             60        < 0.1%
-358.44             60        < 0.1%
-481.6              60        < 0.1%
576.37              60        < 0.1%
472.05              60        < 0.1%
330.03              60        < 0.1%
195.03              60        < 0.1%
Other values (108)  4560      < 0.1%
(Missing)           32792416  > 99.9%

Minimum 10 values
Value    Count  Frequency (%)
-570.9   60     < 0.1%
-526.63  60     < 0.1%
-492.43  60     < 0.1%
-481.6   60     < 0.1%
-447.74  60     < 0.1%
-437.04  60     < 0.1%
-413.46  60     < 0.1%
-403.14  60     < 0.1%
-392.38  60     < 0.1%
-368.93  60     < 0.1%

Maximum 10 values
Value   Count  Frequency (%)
576.37  60     < 0.1%
544.07  60     < 0.1%
505.27  60     < 0.1%
500.43  60     < 0.1%
472.05  60     < 0.1%
444.05  1      < 0.1%
444     1      < 0.1%
443.96  5      < 0.1%
443.95  2      < 0.1%
443.94  1      < 0.1%

y
Real number (ℝ)

Distinct: 117
Distinct (%): 2.3%
Missing: 32792416
Missing (%): > 99.9%
Infinite: 0
Infinite (%): 0.0%
Mean: -2.5186085
Minimum: -521.08
Maximum: 509.5
Zeros: 0
Zeros (%): 0.0%
Negative: 2580
Negative (%): < 0.1%
Memory size: 500.5 MiB

Quantile statistics

Minimum: -521.08
5-th percentile: -442.42
Q1: -209.07
Median: -6.055
Q3: 211.66
95-th percentile: 451.52
Maximum: 509.5
Range: 1030.58
Interquartile range (IQR): 420.73

Descriptive statistics

Standard deviation: 269.40973
Coefficient of variation (CV): -106.96769
Kurtosis: -0.8709524
Mean: -2.5186085
Median Absolute Deviation (MAD): 203.945
Skewness: 0.015403941
Sum: -12996.02
Variance: 72581.602
Monotonicity: Not monotonic
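The large and negative CV values for the position columns follow from the definition CV = standard deviation / mean: the sign comes from the mean, and the magnitude grows as the mean approaches zero. A short check with the reported means and standard deviations of x, y and z:

```python
# (mean, standard deviation) copied from the x, y and z summaries.
stats = {
    "x": (5.8708295, 285.15121),
    "y": (-2.5186085, 269.40973),
    "z": (-23.905766, 296.45656),
}

cv = {name: std / mean for name, (mean, std) in stats.items()}

for name, value in cv.items():
    print(f"{name}: CV = {value:.4f}")
```

For coordinates centred near the origin, the standard deviation or the IQR is therefore a more informative measure of spread than the CV.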
Histogram with fixed size bins (bins=50)
Common values
Value               Count     Frequency (%)
23.17               60        < 0.1%
179.19              60        < 0.1%
159.98              60        < 0.1%
140.44              60        < 0.1%
120.56              60        < 0.1%
101.39              60        < 0.1%
170.92              60        < 0.1%
127.9               60        < 0.1%
127.2               60        < 0.1%
125.59              60        < 0.1%
Other values (107)  4560      < 0.1%
(Missing)           32792416  > 99.9%

Minimum 10 values
Value    Count  Frequency (%)
-521.08  60     < 0.1%
-501.45  60     < 0.1%
-481.74  60     < 0.1%
-461.99  60     < 0.1%
-442.42  60     < 0.1%
-424.5   60     < 0.1%
-422.83  60     < 0.1%
-404.48  60     < 0.1%
-384.3   60     < 0.1%
-364.83  60     < 0.1%

Maximum 10 values
Value   Count  Frequency (%)
509.5   60     < 0.1%
490.22  60     < 0.1%
470.86  60     < 0.1%
463.72  60     < 0.1%
451.52  60     < 0.1%
432.35  60     < 0.1%
412.79  60     < 0.1%
393.24  60     < 0.1%
374.24  60     < 0.1%
354.24  60     < 0.1%

z
Real number (ℝ)

Distinct: 4975
Distinct (%): 96.4%
Missing: 32792416
Missing (%): > 99.9%
Infinite: 0
Infinite (%): 0.0%
Mean: -23.905766
Minimum: -512.82
Maximum: 524.56
Zeros: 0
Zeros (%): 0.0%
Negative: 2736
Negative (%): < 0.1%
Memory size: 500.5 MiB

Quantile statistics

Minimum: -512.82
5-th percentile: -467.4815
Q1: -283.2
Median: -35.115
Q3: 228.5575
95-th percentile: 451.284
Maximum: 524.56
Range: 1037.38
Interquartile range (IQR): 511.7575

Descriptive statistics

Standard deviation: 296.45656
Coefficient of variation (CV): -12.401049
Kurtosis: -1.2180985
Mean: -23.905766
Median Absolute Deviation (MAD): 257.6
Skewness: 0.10163346
Sum: -123353.75
Variance: 87886.493
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values
Value                Count     Frequency (%)
41.92                2         < 0.1%
-369.39              2         < 0.1%
484.44               2         < 0.1%
-176.41              2         < 0.1%
450.4                2         < 0.1%
433.38               2         < 0.1%
416.36               2         < 0.1%
399.34               2         < 0.1%
382.32               2         < 0.1%
365.3                2         < 0.1%
Other values (4965)  5140      < 0.1%
(Missing)            32792416  > 99.9%

Minimum 10 values
Value    Count  Frequency (%)
-512.82  1      < 0.1%
-510.57  1      < 0.1%
-510.18  1      < 0.1%
-509.09  1      < 0.1%
-508.41  1      < 0.1%
-507.4   1      < 0.1%
-507.28  1      < 0.1%
-507.16  1      < 0.1%
-506.97  1      < 0.1%
-506.62  1      < 0.1%

Maximum 10 values
Value   Count  Frequency (%)
524.56  1      < 0.1%
523.42  1      < 0.1%
516.67  1      < 0.1%
512.95  1      < 0.1%
512.74  1      < 0.1%
507.53  1      < 0.1%
506.4   1      < 0.1%
506.23  1      < 0.1%
506.14  1      < 0.1%
505.72  1      < 0.1%

Interactions

[Figures: 36 interaction plots (Matplotlib SVG), not reproduced here]

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
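Nullity correlation can be illustrated as the Pearson correlation between per-column missingness indicators (1 where a value is missing, 0 otherwise). The sketch below uses a made-up six-row table that mimics the pattern in this dataset, where rows have either time and charge or x, never both. The helper name `pearson` and the toy rows are illustrative, not part of the report.

```python
import math

def pearson(a, b):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    var_a = sum((u - mean_a) ** 2 for u in a)
    var_b = sum((v - mean_b) ** 2 for v in b)
    return cov / math.sqrt(var_a * var_b)

# Toy rows: four pulse-like rows, then two geometry-like rows.
rows = [
    {"time": 5928.0, "charge": 1.325, "x": None},
    {"time": 6115.0, "charge": 1.175, "x": None},
    {"time": 6492.0, "charge": 0.925, "x": None},
    {"time": 6665.0, "charge": 0.225, "x": None},
    {"time": None, "charge": None, "x": -10.97},
    {"time": None, "charge": None, "x": -10.97},
]

# Missingness indicators per column.
is_missing = {
    col: [1 if row[col] is None else 0 for row in rows]
    for col in ("time", "charge", "x")
}

print(round(pearson(is_missing["time"], is_missing["charge"]), 6))
print(round(pearson(is_missing["time"], is_missing["x"]), 6))
```

Columns that are always missing together score +1, and columns that are never present in the same row score -1.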

Sample

First rows
    sensor_id    time  charge  auxiliary    x    y    z
24       3918  5928.0   1.325       True  NaN  NaN  NaN
24       4157  6115.0   1.175       True  NaN  NaN  NaN
24       3520  6492.0   0.925       True  NaN  NaN  NaN
24       5041  6665.0   0.225       True  NaN  NaN  NaN
24       2948  8054.0   1.575       True  NaN  NaN  NaN
24        860  8124.0   0.675       True  NaN  NaN  NaN
24       2440  8284.0   1.625       True  NaN  NaN  NaN
24       1743  8478.0   0.775       True  NaN  NaN  NaN
24       3609  8572.0   1.025       True  NaN  NaN  NaN
24       5057  8680.0   3.975       True  NaN  NaN  NaN

Last rows
      sensor_id  time  charge  auxiliary       x     y        z
5150       5150   NaN     NaN        NaN  -10.97  6.72  -437.34
5151       5151   NaN     NaN        NaN  -10.97  6.72  -444.35
5152       5152   NaN     NaN        NaN  -10.97  6.72  -451.36
5153       5153   NaN     NaN        NaN  -10.97  6.72  -458.37
5154       5154   NaN     NaN        NaN  -10.97  6.72  -465.38
5155       5155   NaN     NaN        NaN  -10.97  6.72  -472.39
5156       5156   NaN     NaN        NaN  -10.97  6.72  -479.39
5157       5157   NaN     NaN        NaN  -10.97  6.72  -486.40
5158       5158   NaN     NaN        NaN  -10.97  6.72  -493.41
5159       5159   NaN     NaN        NaN  -10.97  6.72  -500.73
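The sample is consistent with a table built by stacking two sources row-wise: pulse records (sensor_id, time, charge, auxiliary) followed by one position record per sensor (sensor_id, x, y, z). This is an assumption, since the report does not describe how the input was assembled. A minimal standard-library sketch of that stacking, with a hypothetical helper `stack_rows` and a few values read off the sample:

```python
ALL_COLUMNS = ("sensor_id", "time", "charge", "auxiliary", "x", "y", "z")

def stack_rows(*tables):
    """Concatenate tables row-wise; columns a row lacks are filled with None."""
    return [
        {column: row.get(column) for column in ALL_COLUMNS}
        for table in tables
        for row in table
    ]

# Hypothetical miniature inputs.
pulses = [
    {"sensor_id": 3918, "time": 5928.0, "charge": 1.325, "auxiliary": True},
    {"sensor_id": 4157, "time": 6115.0, "charge": 1.175, "auxiliary": True},
]
geometry = [
    {"sensor_id": 5159, "x": -10.97, "y": 6.72, "z": -500.73},
]

stacked = stack_rows(pulses, geometry)
missing_per_column = {
    column: sum(row[column] is None for row in stacked)
    for column in ALL_COLUMNS
}
print(missing_per_column)
```

In pandas, `pd.concat` on two frames with different columns produces the same pattern, with NaN in place of None.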

Duplicate rows

Most frequently occurring

        sensor_id     time  charge  auxiliary    x    y    z  # duplicates
33854         779   9880.0   0.975      False  NaN  NaN  NaN             5
114297       2702   9884.0   1.075      False  NaN  NaN  NaN             5
126583       3000   9897.0   0.875      False  NaN  NaN  NaN             5
161188       3840   9883.0   0.925      False  NaN  NaN  NaN             5
166480       3966   9969.0   0.925      False  NaN  NaN  NaN             5
3583           72   9863.0   0.725      False  NaN  NaN  NaN             4
7175          165  12213.0   0.875      False  NaN  NaN  NaN             4
8207          180   9887.0   0.575      False  NaN  NaN  NaN             4
8811          193   9881.0   1.075      False  NaN  NaN  NaN             4
13370         301   9853.0   0.325      False  NaN  NaN  NaN             4
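A table of this shape can be produced by grouping identical rows and counting them. The sketch below does this with the standard library on made-up rows (the sensor ids 101, 202, 303 are invented for illustration); it is not the profiling tool's own implementation.

```python
from collections import Counter

# Made-up rows: (sensor_id, time, charge, auxiliary).
rows = [
    (101, 9880.0, 0.975, False),
    (101, 9880.0, 0.975, False),
    (101, 9880.0, 0.975, False),
    (202, 9884.0, 1.075, False),
    (202, 9884.0, 1.075, False),
    (303, 9897.0, 0.875, True),
]

# Count identical rows and keep those that occur more than once.
counts = Counter(rows)
duplicated = {row: n for row, n in counts.items() if n > 1}

# Most frequently occurring first.
table = sorted(duplicated.items(), key=lambda item: item[1], reverse=True)
for row, n in table:
    print(*row, n)
```

Missing values are left out of the toy rows on purpose: float NaN does not compare equal to itself, so a real implementation needs a rule for treating missing cells as equal when grouping.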